Abstract: We present a novel algorithm that solves the turbo 
code LP decoding problem in a finite number of steps by 
Euclidean distance minimizations, which in turn rely on repeated 
shortest path computations in the trellis graph representing the 
turbo code. Previous attempts to exploit the combinatorial graph 
structure only led to algorithms which are either of heuristic 
nature or do not guarantee finite convergence. A numerical study 
shows that our algorithm clearly beats the running time, up to a 
factor of 100, of generic commercial LP solvers for medium-sized 
codes, especially for high SNR values. 

Index Terms — LP decoding, turbo codes, combinatorial opti- 
mization 



I. Introduction 

Since its introduction by Feldman et al. in 2002 [1], linear 
programming (LP) based channel decoding has gained tremendous 
interest because of its analytical power: LP decoding exhibits 
the maximum likelihood (ML) certificate property [2], and the 
decoding behavior is completely determined by the explicitly 
described "fundamental" polytope [3]. This is combined with 
noteworthy error-correcting performance and the availability of 
efficient decoding algorithms. 

Turbo codes, invented by Berrou et al. in 1993 [4], are a 
class of concatenated convolutional codes that, together with 
a heuristic iterative decoding algorithm, feature remarkable 
error-correcting performance. 

While the first paper on LP decoding [1] actually dealt with 
turbo codes, the majority of publications in the area of LP 
decoding now focus on LDPC codes [5], which provide similar 
performance (cf. [6] for a recent overview). Nevertheless, 
turbo codes have some analytical advantages, most impor- 
tantly the inherent combinatorial structure by means of the 
trellis graph representations of the underlying convolutional 
encoders. ML Decoding of turbo codes is closely related 
to shortest path and minimum network flow problems, both 
being classical, well-studied topics in optimization theory for 
which plenty of efficient solution methods exist. The hardness 
of ML decoding is caused by additional conditions on the 
path through the trellis graphs (termed agreeability 
constraints in [1]) posed by the turbo code's interleaver. Thus 
ML (LP) decoding is equivalent to solving a (LP-relaxed) 
shortest path problem with additional linear side constraints. 

So far, two methods for solving the LP have been proposed. 
General purpose LP solvers like CPLEX [7] are based on the 
matrix representation of the LP problem. They utilize either 
the simplex method or interior point approaches but do 
not exploit any structural properties of the specific problem. 
Lagrangian relaxation in conjunction with subgradient 
optimization [1], [9], on the other hand, utilizes this structure, 
but has practical limitations; most notably, it usually converges 
very slowly. 

This paper presents a new approach to solve the LP 
decoding problem exactly by an algorithm that exploits its 
graphical substructure, thus combining the analytical power 
of the LP approach with the running-time benefits of a com- 
binatorial method which seems to be a necessary requirement 
for practical implementation. Our basic idea is to construct 
an alternative polytope in the space defined by the additional 
constraints (called constraints space) and show how the LP 
solution corresponds to a specific point of that polytope. 
Then, we show how to find that point computationally by a 
geometric algorithm that relies on a sequence of shortest path 
computations in the trellis graphs. 

The reinterpretation of constrained optimization problems 
in constraints space was first developed in the context of 
multicriteria optimization in [10], where it is applied to 
minimum spanning tree problems with a single side constraint. In 
2010, Tanatmis [11] applied this theory to the turbo decoding 
problem. His algorithm showed a drastic speedup compared 
to a general purpose LP solver; however, it only works for up 
to two constraints, while in real-world turbo codes the number 
of constraints equals the information length. 

By adapting an algorithm by Wolfe [12] that computes the point 
of a polytope with minimum Euclidean norm, we are 
able to overcome these limitations and decode turbo codes 
with lengths of practical interest. The algorithm is, compared 
to previous methods, advantageous not only in terms of 
running time, but also gives valuable information that can help 
to improve the error-correcting performance. Furthermore, 
branch-and-bound methods for integer programming-based 
ML decoding depend upon fast lower bound computations, 
mostly given by LP relaxations, and can often be significantly 
improved by dedicated methods that evaluate combinatorial 
properties of the LP solutions. Since our LP decoder contains 
such information, it could also be considered a step towards 
IP-based algorithms with the potential of practical 
implementation. 

Fig. 1: Turbo encoder with two convolutional encoders $C_a$, 
$C_b$ and interleaver $\pi$. 

II. Background and Notation 
A. Definition of Turbo Codes 

A $k$-dimensional subspace $C$ of the vector space $\mathbb{F}_2^n$ (where 
$\mathbb{F}_2 = \{0, 1\}$ denotes the binary field) is called an $(n, k)$ binary 
linear block code, where $n$ is the block length and $k$ the 
information (or input) length. One way to define a code is by 
an appropriate encoding function $e_C$, for which any bijective 
linear mapping from $\mathbb{F}_2^k$ onto $C$ qualifies. This paper deals 
with turbo codes [4], a special class of block codes built by 
interconnecting (at least) two convolutional codes (see e.g. 
[13]). For the sake of clear notation, we focus on turbo codes 
as used in the 3GPP LTE standard [14], i.e., systematic, parallel 
concatenated turbo codes with two identical terminated 
rate-1 constituent encoders, although our approach 
is applicable to arbitrary turbo coding schemes. An in-depth 
treatment of turbo code construction can be found in [15]. 

An $(n, k)$ turbo code $TC = TC(C, \pi)$ is defined by a rate-1 
convolutional $(n_C, k)$ code $C$ with constraint length $d$ and a 
permutation $\pi \in S_k$ such that $n = k + 2 \cdot n_C$. Because we 
consider terminated convolutional codes only (i.e., there is a 
designated terminal state of the encoder), the final $d$ bits of 
the information sequence (also called the tail) cannot be chosen 
freely and thus cannot carry any information. Consequently, 
those bits together with the corresponding $d$ output bits are 
considered part of the output, which yields $n_C = k + 2 \cdot d$ 
and a code rate slightly below 1. Let $e_C : \mathbb{F}_2^k \to \mathbb{F}_2^{n_C}$ be the 
associated encoding function. Then, the encoding function of 
$TC$ is defined as 

$$e_{TC} : \mathbb{F}_2^k \to \mathbb{F}_2^{k + 2 \cdot n_C}, \qquad x \mapsto (x \mid e_C(x) \mid e_C(\pi(x))),$$

where $\pi(x) = (x_{\pi(1)}, \dots, x_{\pi(k)})$. In other words, the 
codeword for an input word $x$ is obtained by concatenating 

• a copy of $x$ itself, 

• a copy of $x$ encoded by $C$, and 

• a copy of $x$, permuted by $\pi$ and encoded by $C$ afterwards. 

Figure 1 shows a circuit-type visualization of this definition. 
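The concatenation above can be illustrated with a minimal Python sketch. The constituent encoder used here is a hypothetical toy example (a terminated recursive encoder with feedback $1 + D + D^2$, parity $1 + D^2$ and $d = 2$), not the LTE encoder; the names `encode_c`, `encode_turbo` and `perm` are assumptions made for illustration only.

```python
D = 2  # memory (constraint length d) of the toy constituent encoder


def encode_c(x):
    """Terminated toy encoder: returns D tail bits followed by k + D parity bits."""
    s1 = s2 = 0                      # shift register, starts in state 0
    tail, parity = [], []
    for t in range(len(x) + D):
        if t < len(x):
            bit = x[t]               # free information bit
        else:
            bit = s1 ^ s2            # tail bit: forces the feedback value to 0
            tail.append(bit)
        a = bit ^ s1 ^ s2            # feedback 1 + D + D^2
        parity.append(a ^ s2)        # parity output 1 + D^2
        s1, s2 = a, s1
    assert (s1, s2) == (0, 0)        # encoder ends in the designated terminal state
    return tail + parity             # n_C = k + 2 * D bits


def encode_turbo(x, perm):
    """e_TC(x) = (x | e_C(x) | e_C(pi(x))) with pi(x)_j = x[perm[j]] (0-based)."""
    permuted = [x[j] for j in perm]
    return list(x) + encode_c(x) + encode_c(permuted)
```

For an input of length $k = 4$ this produces $n = k + 2 \cdot (k + 2d) = 20$ bits, matching $n = k + 2 \cdot n_C$.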

B. Trellis Graphs of Convolutional Codes 

A convolutional code with a specific length is represented 
naturally by its trellis graph, which is obtained by unfolding 
the code-defining finite state machine in the time domain: Each 
vertex of the trellis represents the state at a specific point in 
time, while edges correspond to valid transitions between two 
subsequent states and exhibit labels with the corresponding 
input and output bit, respectively. The following description 
of convolutional codes loosely follows [6, Section V.C], although 
the notation differs slightly. 

Fig. 2: Excerpt from a trellis graph with four states and 
initial state 0. The style of an edge indicates the corresponding 
information bit, while the labels refer to the single parity bit. 

We denote a trellis by $T = (V, E)$ with vertex set $V$ and 
edge set $E$. Vertices are indexed by time step and state; i.e., 
$v_{i,s}$ denotes the vertex corresponding to state $s \in \{0, \dots, 2^d - 1\}$ 
at time $i \in \{1, \dots, k + d + 1\}$. An edge in turn is identified 
by the time and state of its tail vertex plus its input label, so 
$e_{i,s,b}$ denotes the edge outgoing from $v_{i,s}$ with input bit 
$b \in \{0, 1\}$. We call vertical "slices", i.e., the subgraphs induced 
by the edges of a single time step, segments of the trellis. 
Formally, the segment at time $i$ is 

$$S_i = (V_i, E_i)$$

where $V_i = \{v_{j,s} \in V : j \in \{i, i + 1\}\}$ 
and $E_i = \{e_{j,s,b} \in E : j = i\}$. 

Because the initial and final state of the convolutional encoder 
are fixed, the leading as well as the trailing $d$ segments contain 
fewer than $2^d$ vertices. Figure 2 shows the first few segments 
of a trellis with $d = 2$. 

By construction, the paths from the starting node to the 
end node in a trellis of a convolutional code $C$ are in one-to-one 
correspondence with the codewords of $C$: Let $I_j \subseteq E_j$ 
and $O_j \subseteq E_j$ be those edges of $S_j$ whose input label 
and output label, respectively, is a 1. The correspondence 
between a codeword $y \in \mathbb{F}_2^{k + 2 \cdot d}$ and the corresponding path 
$P = (e_1, \dots, e_{k+d})$ in $T$ is given by 

$$y_j = 1 \iff \begin{cases} e_{k+j} \in I_{k+j} & \text{for } 1 \le j \le d, \\ e_{j-d} \in O_{j-d} & \text{for } d < j \le k + 2 \cdot d, \end{cases} \qquad (1)$$

where the first part accounts for the $d$ "input" tail bits that are 
prepended by convention. From (1), for each $e \in E$ an index 
set $J_C(e)$ can be computed with the property that $e \in P$ implies 
$y_j = 1$ for all $j \in J_C(e)$. In our case, $|J_C(e)|$ varies from 0 
(for edges in $S_i$, $i \le k$, with output label 0) to 2 (for edges 
in $S_i$, $k + 1 \le i \le k + d$, with both input and output label 1). 

The path-codeword relation can be exploited for maximum 
likelihood (ML) decoding if the codewords are transmitted 
through a memoryless binary-input output-symmetric 
(MBIOS) channel: Let $\lambda \in \mathbb{R}^{k + 2 \cdot d}$ be the vector of LLR 
values of the received signal. If we assign to each edge $e \in E$ 
the cost 

$$c(e) = \sum_{j \in J_C(e)} \lambda_j,$$

it can be shown that the shortest path in $T$ corresponds to 
the ML codeword. 
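The shortest path formulation can be sketched as a forward dynamic program over the trellis segments. The code below is a minimal illustration under stated assumptions: it uses a hypothetical toy encoder with $d = 2$ (feedback $1 + D + D^2$, parity $1 + D^2$), orders the codeword bits as in (1), and charges each edge the sum of the LLR values of the codeword positions it sets to 1. All function names are made up for this example, and `brute_force_cost` is included only as a reference check.

```python
from itertools import product

D = 2  # memory of the toy encoder (hypothetical example code)


def step(state, bit):
    """One trellis edge: returns (next_state, parity_bit)."""
    s1, s2 = state
    a = bit ^ s1 ^ s2
    return (a, s1), a ^ s2


def codeword(x):
    """Codeword of the terminated toy code: D tail inputs, then k + D parity bits."""
    state, tail, parity = (0, 0), [], []
    for t in range(len(x) + D):
        bit = x[t] if t < len(x) else state[0] ^ state[1]
        if t >= len(x):
            tail.append(bit)
        state, p = step(state, bit)
        parity.append(p)
    return tail + parity


def shortest_path_decode(llr, k):
    """Forward dynamic program over the k + D trellis segments."""
    best = {(0, 0): (0.0, [])}              # state -> (path cost, input bits)
    for i in range(1, k + D + 1):
        nxt = {}
        for state, (cost, bits) in best.items():
            for b in (0, 1):
                if i > k and b != state[0] ^ state[1]:
                    continue                # tail segments only contain terminating edges
                new_state, p = step(state, b)
                c = cost
                if i > k and b:
                    c += llr[i - k - 1]     # tail input bit is codeword position i - k
                if p:
                    c += llr[i + D - 1]     # parity bit is codeword position i + D
                if new_state not in nxt or c < nxt[new_state][0]:
                    nxt[new_state] = (c, bits + [b])
        best = nxt
    cost, bits = best[(0, 0)]
    return cost, bits[:k]


def brute_force_cost(llr, k):
    """Reference value: minimum of llr^T y over all 2^k codewords."""
    return min(sum(l * y for l, y in zip(llr, codeword(list(x))))
               for x in product((0, 1), repeat=k))
```

The dynamic program visits every edge once, so its running time is linear in the number of trellis edges, whereas the brute-force reference is exponential in $k$.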



C. Trellis Representation of Turbo Codes 

For turbo codes, we have two isomorphic trellis graphs, 
$T^1$ and $T^2$, according to the two component convolutional 
encoders. Let formally $T = (V^1 \cup V^2, E^1 \cup E^2)$, and by 
$P = P^1 \circ P^2$ denote the path that consists of $P^1$ in 
$T^1$ and $P^2$ in $T^2$. Only certain paths, called agreeable, 
actually correspond to codewords; namely, an agreeable path 
$P^1 \circ P^2 = (e^1_1, \dots, e^1_{k+d}, e^2_1, \dots, e^2_{k+d})$ must obey the $k$ 
consistency constraints 

$$e^1_i \in I^1_i \iff e^2_{\pi^{-1}(i)} \in I^2_{\pi^{-1}(i)} \qquad \text{for } i = 1, \dots, k \qquad (2)$$

because both encoders operate on the same information word, 
only that it is permuted for the second encoder. Consequently, 
ML decoding for turbo codes can be formulated as finding 
the shortest agreeable path in $T$. If an agreeable path contains 
$e^1_i \in I^1_i$, it must also contain $e^2_{\pi^{-1}(i)} \in I^2_{\pi^{-1}(i)}$, and thus 
$i \in J_{TC}(e)$ (the index set of $e$ in the turbo codeword) 
for both $e^1_i$ and $e^2_{\pi^{-1}(i)}$. To avoid counting the LLR value $\lambda_i$ 
twice in the objective function, we use the modified cost 

$$c(e) = \sum_{j \in J_{TC}(e)} \hat\lambda_j, \qquad \hat\lambda_j = \begin{cases} \lambda_j / 2 & \text{if } 1 \le j \le k, \\ \lambda_j & \text{otherwise.} \end{cases} \qquad (3)$$



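Then, the ML decoding problem for turbo codes can be stated 
as the combinatorial optimization problem 

(TC-ML)  $\min \sum_{e \in P} c(e)$  (4) 
    s.t. $P^1$ is a path in $T^1$  (5) 
         $P^2$ is a path in $T^2$  (6) 
         $P$ is agreeable. 

The codeword variables can be included into (TC-ML) by 
the constraints 

$$y_i = \begin{cases} \frac{1}{2} \sum_{e : J_{TC}(e) \ni i} f_e & \text{for } 1 \le i \le k, \\ \sum_{e : J_{TC}(e) \ni i} f_e & \text{for } i > k, \end{cases} \qquad (7)$$

where $f_e$ are the flow variables introduced below and the factor 
$\frac{1}{2}$ is analogous to (3). However, these 
variables are purely auxiliary in the LP and thus not needed. 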

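It is straightforward to formulate TC-ML as an integer linear 
program by introducing a binary flow variable $f_e \in \{0, 1\}$ for 
each $e \in E^1 \cup E^2$. The constraints (5) and (6) can be restated 
in terms of flow conservation and capacity constraints [16] 
which define the path polytopes $\mathcal{P}^1_{path}$ and $\mathcal{P}^2_{path}$, respectively. 
By also transforming (2) and (4) we obtain 

(TC-IP)  $\min \sum_{e \in E^1 \cup E^2} c(e) \cdot f_e$  (8) 
    s.t. $f^1 \in \mathcal{P}^1_{path}$, $f^2 \in \mathcal{P}^2_{path}$  (9) 
         $\sum_{e \in I^1_i} f_e = \sum_{e \in I^2_{\pi^{-1}(i)}} f_e$, $i = 1, \dots, k$  (10) 
         $f_e \in \{0, 1\}$, $e \in E$,  (11) 

where $f^1$ and $f^2$ denote the parts of $f$ that belong to $T^1$ and $T^2$. 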



D. Polyhedral Theory Background 

Besides coding theory, this paper requires some polyhedral 
theory. A polytope is the convex hull of a finite number 
of points: $\mathcal{P} = \operatorname{conv}(v_1, \dots, v_n)$. It can be described either 
by its vertices (or extreme points), i.e., the unique minimal 
set fulfilling this defining property, or as the intersection of a 
finite number of halfspaces: $\mathcal{P} = \bigcap_{i=1}^m \{x : a_i^T x \le b_i\}$. An 
inequality $a^T x \le b$ is called valid for $\mathcal{P}$ if it is true for all 
$x \in \mathcal{P}$. In that case, the set $F_{a,b} = \{x \in \mathcal{P} : a^T x = b\}$ is 
called the face induced by the inequality. For any $r$ satisfying 
$a^T r \ge b$ ($a^T r > b$) we say that the inequality separates 
(strongly separates) $r$ from $\mathcal{P}$. 

III. The LP Relaxation and Conventional 
Solution Methods 

ML decoding of general linear block codes is known to be 
NP-hard [17]. While the computational complexity of TC-IP 
is still open, it is widely believed that this problem is 
NP-hard as well, which would imply that no polynomial-time 
algorithm can solve TC-IP unless P = NP.¹ By relaxing (11) 
to $f_e \in [0, 1]$, we get the LP relaxation (referred to as TC-LP) 
of the integer program TC-IP, which in contrast can be solved 
efficiently by the simplex method or interior point approaches 
[8]. Feldman et al. [1] were the first to analyze this relaxation 
and attested it reasonable decoding performance. 

A general purpose LP solver, however, does not make use 
of the combinatorial substructure contained in TC-IP via (8) 
and (9) and thus wastes some potential for solving the problem 
more efficiently: while LPs are solvable in polynomial time, 
they do not scale too well, and the number of variables (about 
$2 \cdot |V| = (k + d) \cdot 2^{d+2}$) and constraints ($|V| + k$) in TC-LP is 
very large (practical values of $d$ range roughly from 3 to 8). 

Note that without the consistency constraints (10), we could 
solve TC-LP by simply computing shortest paths in both trellis 
graphs, which is possible in time $O(k + d)$, even in the 
presence of negative weights, because the graphs are acyclic 
[19]. A popular approach for solving optimization problems 
that comprise an "easy" subproblem plus some "complicating" 
additional constraints is to solve the Lagrangian dual [20] by 
subgradient optimization. If we define 
$g_i(f) = \sum_{e \in I^1_i} f_e - \sum_{e \in I^2_{\pi^{-1}(i)}} f_e$, 
the constraints (10) can be compactly rewritten as 

$$g_i(f) = 0 \quad \text{for } i = 1, \dots, k.$$
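For an integral pair of paths, each sum in $g_i(f)$ is simply the input bit taken in the respective segment, so the constraint values are differences of input bits. The following sketch illustrates this under assumptions made for the example: indices are 0-based, the second encoder reads bit `x[perm[j]]` in segment `j`, and the names `consistency_values` and `image_point` are hypothetical.

```python
def consistency_values(bits1, bits2, perm):
    """g_i = (input bit of path 1 in segment i) - (input bit of path 2 in segment pi^{-1}(i))."""
    k = len(perm)
    inverse = [0] * k
    for j, i in enumerate(perm):
        inverse[i] = j               # second encoder reads x[perm[j]] in segment j
    return [bits1[i] - bits2[inverse[i]] for i in range(k)]


def image_point(bits1, bits2, perm, cost):
    """Vector (g_1, ..., g_k, c) for an integral pair of paths with total cost `cost`."""
    return consistency_values(bits1, bits2, perm) + [cost]
```

A pair of paths is agreeable exactly when all returned $g_i$ values are zero, for example when `bits2` equals `[bits1[j] for j in perm]`.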



¹ Note that with state-of-the-art software and prohibitive computational 
effort, ML turbo decoding can be simulated off-line on desktop computers; 
see [18]. 
The Lagrangian relaxation with multiplier $\mu \in \mathbb{R}^k$ is defined 
as 

(TC-LR)  $z(\mu) = \min \sum_{e \in E^1 \cup E^2} c(e) \cdot f_e + \sum_{i=1}^k \mu_i \cdot g_i(f)$ 
    s.t. $f^1 \in \mathcal{P}^1_{path}$, $f^2 \in \mathcal{P}^2_{path}$ 
         $f_e \in \{0, 1\}$, $e \in E$. 

For all $\mu \in \mathbb{R}^k$, the objective value of TC-LR is smaller than 
or equal to that of TC-LP. The Lagrangian dual problem 
is to find multipliers $\mu$ that maximize this objective, thus 
minimizing the gap to the LP solution. It can be shown 
that in the optimal case both values coincide. Note that the 
feasible region of TC-LR is the combined path polytope of 
both $T^1$ and $T^2$, so it can be solved by a shortest path 
routine in both trellises with modified costs, and the integrality 
condition on $f$ is fulfilled automatically. Applying Lagrangian 
relaxation to turbo decoding was already proposed by Feldman 
et al. [1] and further elaborated by Tanatmis et al. [9]; the latter 
reference combines the approach with a heuristic to tighten the 
integrality gap between TC-LP and TC-IP. 

The Lagrangian dual is typically solved by a subgradient 
algorithm that iteratively adjusts the multipliers $\mu$, converging 
(under some mild conditions) to the optimal value [20]. 
However, the convergence is often slow in practice and the 
limit is not guaranteed to be ever reached exactly. Additionally, 
the dual only informs us about the objective value; recovering 
the actual solution of the problem requires additional work. 
In summary, subgradient algorithms suffer from three major 
flaws. The main result of this paper is an alternative algorithm 
which exhibits none of these. 

IV. An Equivalent Problem in Constraints Space 

Like Lagrangian dualization, our algorithm also uses a 
relaxed formulation of TC-IP with modified objective function 
that resembles TC-LR. However, via geometric interpretation 
of the image of the path polytope in the "constraints space", 
as defined below, the exact LP solution is found in finitely 
many steps. 

A. The Image Polytope Q 

Let $\mathcal{P}_{path} = \mathcal{P}^1_{path} \times \mathcal{P}^2_{path}$ be the feasible region of TC-LR. 
We define the map 

$$\mathfrak{D} : \mathcal{P}_{path} \to \mathbb{R}^{k+1}, \qquad f \mapsto (g_1(f), \dots, g_k(f), c(f))^T,$$

where $c(f) = \sum_{e \in E^1 \cup E^2} c(e) \cdot f_e$ is a shorthand for the 
objective function value of TC-LP. For a path $f$, the first $k$ 
coordinates $v_i$, $i = 1, \dots, k$, of $v = \mathfrak{D}(f)$ tell if and how the 
condition $g_i(f) = 0$ is violated, while the last coordinate $v_{k+1}$ 
equals the cost of $f$. Let $Q = \mathfrak{D}(\mathcal{P}_{path})$ be the image of the 
path polytope under $\mathfrak{D}$. The following results are immediate: 

Lemma 1: 

1) $Q$ is a polytope. 

2) If $f$ represents an agreeable path in $T$, then $\mathfrak{D}(f)$ is 
located on the $(k+1)$st axis (henceforth called $c$-axis 
or $A_c$). 

3) If $v$ is a vertex of $Q$ and $v = \mathfrak{D}(f)$ for some $f \in \mathcal{P}_{path}$, 
then $f$ is also a vertex of $\mathcal{P}_{path}$. 

In the situation that $v = \mathfrak{D}(f)$ we will also write $f = \mathfrak{D}^{-1}(v)$ 
with the meaning that $f$ is any preimage of $v$, which need not 
be unique. 

We consider the auxiliary problem 

(TC-LP$_Q$)  $z^Q_{LP} = \min v_{k+1}$ 
    s.t. $v \in Q$ 
         $v \in A_c$,  (12) 

the solution of which is the lower "piercing point" of the axis 
$A_c$ through $Q$. Note that due to (12), $k$ of the $k + 1$ variables in 
TC-LP$_Q$ are fixed to zero, thus the problem is in a sense 
one-dimensional, the feasible region being the (one-dimensional) 
intersection of $Q$ with $A_c$. Nevertheless, the following theorem 
shows that TC-LP$_Q$ and TC-LP are essentially equivalent. 

Theorem 1: Let $v_{LP}$ be an optimal solution of TC-LP$_Q$ 
with objective value $z^Q_{LP}$ and $f_{LP} = \mathfrak{D}^{-1}(v_{LP}) \in \mathcal{P}_{path}$ the 
corresponding flow. Then $z^Q_{LP} = z_{LP}$, the optimal objective 
value of TC-LP, and $f_{LP}$ is an optimal solution of TC-LP. 

Proof: First we show $z^Q_{LP} \le z_{LP}$. Let $f_{LP}$ be an optimal 
solution of TC-LP with cost $c(f_{LP}) = z_{LP}$. Then $\mathfrak{D}(f_{LP}) = 
(0, \dots, 0, z_{LP})^T$ by definition of $\mathfrak{D}$, since $f_{LP}$ is feasible and 
thus $g_1(f_{LP}) = \dots = g_k(f_{LP}) = 0$. Hence $\mathfrak{D}(f_{LP}) \in A_c \cap Q$ 
with $\mathfrak{D}(f_{LP})_{k+1} = z_{LP}$, from which it follows that $z^Q_{LP} \le z_{LP}$. 

If we assume on the other hand that $z^Q_{LP} < z_{LP}$, there must 
be a $v \in A_c \cap Q$ such that $v_{k+1} < z_{LP}$. By definition of $\mathfrak{D}$ 
this implies the existence of a flow $f = \mathfrak{D}^{-1}(v)$ with $g_1(f) = 
\dots = g_k(f) = 0$, hence a feasible one, and $c(f) = v_{k+1} < 
z_{LP}$, contradicting optimality of $z_{LP}$. ■ 
While we do not have an explicit representation of $Q$ (by 
means of either vertices or inequalities) at hand, we can easily 
minimize linear functionals over $Q$: 

Observation 1: The problem 

(LP$_Q$)  $\min \gamma^T v$ 
    s.t. $v \in Q$ 

can be solved by first computing an optimal solution $f^*$ of 
the weighted sum problem 

(TC-WS)  $\min \sum_{i=1}^k \gamma_i \cdot g_i(f) + \gamma_{k+1} \cdot c(f)$ 
    s.t. $f \in \mathcal{P}_{path}$ 

and then taking the image of $f^*$ under $\mathfrak{D}$. As noted before, 
this can be achieved within running time $O(n)$. 

Note that TC-WS is closely related to TC-LR: as long as 
$\gamma_{k+1} > 0$, we get the same problem by setting $\mu_i = \gamma_i / \gamma_{k+1}$ 
in TC-LR. 
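To make the shortest path structure behind Observation 1 explicit, the TC-WS objective can be expanded edge by edge. The following worked restatement uses only the definitions of $g_i(f)$ and $c(f)$; the symbol $\tilde c_\gamma(e)$ is introduced here purely for illustration.

```latex
\begin{align*}
\sum_{i=1}^{k} \gamma_i \, g_i(f) + \gamma_{k+1} \, c(f)
  &= \sum_{i=1}^{k} \gamma_i \Bigl( \sum_{e \in I^1_i} f_e
       - \sum_{e \in I^2_{\pi^{-1}(i)}} f_e \Bigr)
     + \gamma_{k+1} \sum_{e \in E^1 \cup E^2} c(e) \, f_e \\
  &= \sum_{e \in E^1 \cup E^2} \tilde c_\gamma(e) \, f_e,
\qquad
\tilde c_\gamma(e) =
\begin{cases}
\gamma_{k+1} \, c(e) + \gamma_i & \text{if } e \in I^1_i,\ 1 \le i \le k, \\
\gamma_{k+1} \, c(e) - \gamma_i & \text{if } e \in I^2_{\pi^{-1}(i)},\ 1 \le i \le k, \\
\gamma_{k+1} \, c(e) & \text{otherwise.}
\end{cases}
\end{align*}
```

Since every edge belongs to exactly one segment, at most one $\gamma_i$ enters each $\tilde c_\gamma(e)$, and TC-WS separates into one shortest path computation per trellis with edge costs $\tilde c_\gamma$.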

B. Solving TC-LPq with Nearest Point Calculations 

Our algorithm solves TC-LP$_Q$ by a series of nearest point 
computations between $Q$ and reference points on $A_c$, the 
last of which gives a face of $Q$ containing the optimal solution 
$v_{LP}$. 

For each $r \in \mathbb{R}^{k+1}$ we denote by 

$$NP(r) = \operatorname{argmin}_{v \in Q} \|v - r\|_2$$

the nearest point to $r$ in $Q$ with respect to the Euclidean norm 
and define 

$$a(r) = r - NP(r), \qquad b(r) = a(r)^T NP(r).$$

The following well-known result will be used frequently 
below. 

Lemma 2: The inequality 

$$a(r)^T v \le b(r) \qquad (13)$$

is valid for $Q$ and induces a face containing $NP(r)$, which we 
call $NF(r)$. If $r \notin Q$, (13) strongly separates $r$ from $Q$. 
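Lemma 2 is the standard projection property of convex sets. For completeness, a short derivation is sketched below; it relies only on the convexity of $Q$ and on the definitions of $a(r)$ and $b(r)$, and is not taken from the original text.

```latex
% For any v in Q and t in (0,1], the point NP(r) + t (v - NP(r)) lies in Q, so
\begin{align*}
\| r - NP(r) \|_2^2
  &\le \| r - NP(r) - t\,(v - NP(r)) \|_2^2 \\
  &= \| r - NP(r) \|_2^2 - 2t\, a(r)^T (v - NP(r)) + t^2 \| v - NP(r) \|_2^2 .
\end{align*}
% Dividing by t and letting t -> 0 gives
\[
  a(r)^T (v - NP(r)) \le 0
  \quad\Longleftrightarrow\quad
  a(r)^T v \le a(r)^T NP(r) = b(r),
\]
% with equality for v = NP(r). If r is not in Q, then a(r) is nonzero and
\[
  a(r)^T r - b(r) = a(r)^T (r - NP(r)) = \| a(r) \|_2^2 > 0 .
\]
```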
The following theorem is the foundation of our algorithm. 

Theorem 2: There exists an $\varepsilon > 0$ such that for all $r$ inside 
the open line segment $(v_{LP}, v_{LP} - (0, \dots, 0, \varepsilon)^T)$ the condition 

$$v_{LP} \in NF(r)$$

holds. 

Our constructive proof of Theorem 2 shows how to find a point 
inside the interval mentioned in the theorem. The outline is 
as follows: At first, start with a reference point $r \in A_c$ that 
is guaranteed to be located below $v_{LP}$. Then, we iteratively 
compute $NF(r)$ and update $r$ to be the intersection of $A_c$ 
with the hyperplane defining $NF(r)$. The following lemmas 
show that this procedure is valid and finite. 
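The outlined procedure can be tried out in two dimensions ($k = 1$), where $Q$ is a convex polygon and $A_c$ is the vertical axis. The sketch below is an illustration under assumptions, not the decoder itself: the polygon is given explicitly by its vertices in counterclockwise order, the nearest point is found by projecting onto the polygon edges instead of by shortest path computations, and the names `nearest_point` and `pierce` are hypothetical.

```python
def nearest_point(poly, r, tol=1e-12):
    """Nearest point to r in the convex polygon `poly` (vertices counterclockwise)."""
    n = len(poly)
    edges = [(poly[i], poly[(i + 1) % n]) for i in range(n)]

    def cross(u, w, p):
        return (w[0] - u[0]) * (p[1] - u[1]) - (w[1] - u[1]) * (p[0] - u[0])

    if all(cross(u, w, r) >= -tol for u, w in edges):
        return r                                  # r already lies in the polygon
    best, best_d = None, float("inf")
    for u, w in edges:
        dx, dy = w[0] - u[0], w[1] - u[1]
        t = ((r[0] - u[0]) * dx + (r[1] - u[1]) * dy) / (dx * dx + dy * dy)
        t = min(1.0, max(0.0, t))                 # clamp projection to the segment
        p = (u[0] + t * dx, u[1] + t * dy)
        d = (p[0] - r[0]) ** 2 + (p[1] - r[1]) ** 2
        if d < best_d:
            best, best_d = p, d
    return best


def pierce(poly, rho, tol=1e-9):
    """Move r = (0, rho) upwards along the axis until it reaches the polygon."""
    r, steps = (0.0, rho), 0
    while True:
        p = nearest_point(poly, r)
        a = (r[0] - p[0], r[1] - p[1])            # a(r) = r - NP(r)
        if a[0] ** 2 + a[1] ** 2 <= tol ** 2:
            return r, steps                       # r lies in Q: lower piercing point
        b = a[0] * p[0] + a[1] * p[1]             # b(r) = a(r)^T NP(r)
        r = (0.0, b / a[1])                       # hyperplane a^T v = b meets the axis
        steps += 1
```

For the polygon with vertices $(-2, 1)$, $(1, -1)$, $(3, 2)$, $(0, 5)$ and the start value $\rho = -10$, the first nearest point is the vertex $(1, -1)$ (a 0-dimensional face) and the second one lies on the edge through $(-2, 1)$ and $(1, -1)$, whose supporting line meets the axis at the piercing point $(0, -1/3)$.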

The first result is that the hyperplane defining NF(r) is 
always oriented "downwards". 

Lemma 3: Let $r = (0, \dots, 0, \rho)^T$ with $\rho < z_{LP}$ and let 
$a(r)^T v \le b(r)$ be the inequality defined in (13). Then, 

$$a(r)_{k+1} < 0.$$

Proof: Assuming $a(r)_{k+1} \ge 0$, we obtain $a(r)^T v_{LP} = 
a(r)_{k+1} z_{LP} \ge a(r)_{k+1} \rho = a(r)^T r > b(r)$, which contradicts 
$v_{LP} \in Q$ by Lemma 2. Note that the equalities hold because 
both $v_{LP}$ and $r$ are elements of $A_c$, the first inequality stems 
from the assumptions on $a(r)_{k+1}$ and $\rho$, and the second 
follows from Lemma 2. ■ 
Next we show that updating r leads to a different nearest face, 
unless we have arrived at the optimal solution. 

Lemma 4: Under the same assumptions as in Lemma 3, let 
$s = (0, \dots, 0, s_{k+1})^T$ with $s_{k+1} = b(r) / a(r)_{k+1}$ be the point where 
the separating hyperplane and $A_c$ intersect. If $NF(r) = NF(s)$, 
then $s = v_{LP}$. 

Proof: We use contraposition to show that $s \ne v_{LP}$ 
implies $NF(r) \ne NF(s)$, so assume $s \ne v_{LP}$. We know that 
$a(r)^T v \le b(r)$ is valid for $Q$ and $a(r)^T s = a(r)_{k+1} s_{k+1} = 
b(r)$ by construction. This implies that $s \notin Q$; otherwise 
we would have $s = v_{LP}$ because for all $\zeta < s_{k+1}$, 
$a(r)^T (\zeta e_{k+1}) > b(r)$, so $s$ would really be the lowest point 
on $A_c$ that is also in $Q$ and thus optimal. 

It follows that $y_N := NP(s) \ne s$. Since $y_N \in Q$ and 
$a(r)^T v \le b(r)$ is valid for $Q$, we have $a(r)^T y_N \le b(r)$. 

Case 1: $a(r)^T y_N < b(r)$. Then $y_N \notin NF(r)$, but $y_N \in 
NF(s)$ by definition, which proves the claim for this case. 

Case 2: $a(r)^T y_N = b(r)$. From $a(r)^T r > b(r)$ and $a(r)^T s = 
b(r)$ we obtain 

$$\begin{aligned} & a(r)^T (r - s) > 0 \\ \Rightarrow\ & a(r)_{k+1} (r_{k+1} - s_{k+1}) > 0 \\ \Rightarrow\ & r_{k+1} < s_{k+1}, \end{aligned} \qquad (14)$$

where we have used again $a(r)_{k+1} < 0$ and the fact that 
$r, s \in A_c$. 

Applying Lemma 3 to $s$ as reference point we obtain 
$a(s)_{k+1} = (s - y_N)_{k+1} < 0$, hence 

$$\begin{aligned} & (y_N)_{k+1} > s_{k+1} \\ \Rightarrow\ & (y_N)_{k+1} (s_{k+1} - r_{k+1}) > s_{k+1} (s_{k+1} - r_{k+1}) \quad \text{by (14)} \\ \Rightarrow\ & y_N^T (s - r) > s^T (s - r) \\ \Rightarrow\ & y_N^T s - y_N^T r + s^T r - s^T s > 0. \end{aligned} \qquad (15)$$

Plugging the definitions into $a(r)^T y_N = b(r) = a(r)^T s$ 
yields $(r - NP(r))^T y_N = (r - NP(r))^T s$ or $r^T s - r^T y_N = 
NP(r)^T s - NP(r)^T y_N$. Using this we continue from (15) with 

$$\begin{aligned} \Rightarrow\ & y_N^T s + NP(r)^T s - NP(r)^T y_N - s^T s > 0 \\ \Rightarrow\ & NP(r)^T (s - y_N) > s^T (s - y_N) \\ \Rightarrow\ & a(s)^T NP(r) > a(s)^T s > b(s). \end{aligned}$$

Thus, $NP(r) \notin NF(s) \subseteq \{v \in Q : a(s)^T v \le b(s)\}$, but 
$NP(r) \in NF(r)$ by definition, so those faces must differ. ■ 
Now we show the auxiliary result that if two inequalities 
induce the same face, then also every convex combination of 
them does. 

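Lemma 5: Let $\mathcal{P}$ be a polytope, $x^1, x^2 \in \mathcal{P}$, and $r^1, r^2 \notin 
\mathcal{P}$. If the inequalities 

$$H^1 : (r^1 - x^1)^T x \le (r^1 - x^1)^T x^1$$

and 

$$H^2 : (r^2 - x^2)^T x \le (r^2 - x^2)^T x^2$$

both induce the same face $F$ of $\mathcal{P}$, then also 

$$\bar H : (\bar r - \bar x)^T x \le (\bar r - \bar x)^T \bar x$$

with $\bar r = \lambda r^1 + (1 - \lambda) r^2$, $\bar x = \lambda x^1 + (1 - \lambda) x^2$, $0 \le \lambda \le 1$, 
is valid and induces $F$. 

Proof: We first show that $\bar H$ is valid. For $x \in \mathcal{P}$, 

$$\begin{aligned} (\bar r - \bar x)^T x &= \lambda (r^1 - x^1)^T x + (1 - \lambda)(r^2 - x^2)^T x \\ &\le \lambda (r^1 - x^1)^T x^1 + (1 - \lambda)(r^2 - x^2)^T x^2 \\ &= \bigl(\lambda (r^1 - x^1) + (1 - \lambda)(r^2 - x^2)\bigr)^T \bigl(\lambda x^1 + (1 - \lambda) x^2\bigr) \\ &= (\bar r - \bar x)^T \bar x, \end{aligned}$$

where we have used the fact that $H^1$ is satisfied with equality 
for $x = x^2$ and vice versa because of the assumptions. Since 
we have shown that $\bar H$ is valid, it must induce a face $\bar F$. It 
remains to show that $\bar F = F$. 

"$F \subseteq \bar F$": $x \in F$ fulfills both $H^1$ and $H^2$ with equality, 
so we can carry out the above calculation with a "=" in the 
second line to conclude $x \in \bar F$. 

"$\bar F \subseteq F$": Let $x \in \bar F$ and assume $x \notin F$, which implies 
$(r^i - x^i)^T x < (r^i - x^i)^T x^i$ for $i \in \{1, 2\}$. Then 
$\lambda (r^1 - x^1)^T x^1 + (1 - \lambda)(r^2 - x^2)^T x^2 > \lambda (r^1 - x^1)^T x + (1 - \lambda)(r^2 - x^2)^T x 
= (\bar r - \bar x)^T x = (\bar r - \bar x)^T \bar x = \lambda (r^1 - x^1)^T x^1 + (1 - 
\lambda)(r^2 - x^2)^T x^2$, which is a contradiction. ■ 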

The above lemma is used to show that the part of $A_c$ that lies 
below $v_{LP}$ dissects into intervals such that reference points 
within one interval yield the same face of $Q$. 

Lemma 6: If $r^1, r^2 \in A_c$ with $r^1_{k+1} < r^2_{k+1} < z_{LP}$ and 
$NF(r^1) = NF(r^2)$, then $NF(r) = NF(r^1)$ for all $r \in [r^1, r^2]$. 

Proof: Let $v^i = NP(r^i)$ for $i \in \{1, 2\}$. By Lemma 5, for 
each $\lambda \in (0, 1)$ and $\bar r = \lambda r^1 + (1 - \lambda) r^2$, $\bar v = \lambda v^1 + (1 - \lambda) v^2$, 
it holds that 

$$\{v \in Q : (\bar r - \bar v)^T v = (\bar r - \bar v)^T \bar v\} = NF(r^1),$$

and applying the converse statement from Lemma 2, it follows that 
$\bar v = NP(\bar r)$, so $NF(\bar r) = NF(r^1)$ as claimed. ■ 
Now we have all the ingredients at hand to prove our theorem. 

Proof of Theorem 2: First we show that there exists at 
least one $r$ with the desired properties. 

Choose some arbitrary $r^0 \in A_c$ with $r^0_{k+1} < z_{LP}$ (thus $r^0 \notin 
Q$). If $v_{LP} \in NF(r^0)$, we are done. Otherwise, Lemma 4 tells 
us how to find an $r^1$ with $r^1_{k+1} > r^0_{k+1}$ such that $NF(r^1) \ne 
NF(r^0)$. Iterating this argument and assuming that $v_{LP}$ is never 
contained in the induced face results in a sequence $(r^i)_i$ with 
$r^{i+1}_{k+1} > r^i_{k+1}$ for all $i$. Because of Lemma 6, $NF(r^{i+1}) \ne 
NF(r^i)$ implies $NF(r^{i+1}) \ne NF(r^l)$ for all $0 \le l < i + 1$, so 
that all $NF(r^i)$ are distinct. But since there are only finitely 
many faces of $Q$, this cannot be true, so eventually there must 
be an $i^*$ such that $v_{LP} \in NF(r^{i^*})$. 

Now let $r^* \in A_c$ be any such point whose existence we 
have just proven, $v_N = NP(r^*)$ and $\lambda \in (0, 1]$. Let $\bar r = 
\lambda r^* + (1 - \lambda) v_{LP}$ and $\bar v = \lambda v_N + (1 - \lambda) v_{LP}$. We use similar 
arguments as in the proof of Lemma 5 to show that 

$$(\bar r - \bar v)^T v \le (\bar r - \bar v)^T \bar v \qquad (16)$$

induces $NF(r^*)$. 
For $v \in Q$, 

$$\begin{aligned} (\bar r - \bar v)^T v &= \bigl(\lambda (r^* - v_N) + (1 - \lambda)(v_{LP} - v_{LP})\bigr)^T v \\ &= \lambda (r^* - v_N)^T v \\ &\le \lambda (r^* - v_N)^T v_N \\ &= \lambda \bigl(\lambda (r^* - v_N)^T v_N + (1 - \lambda)(r^* - v_N)^T v_{LP}\bigr) \\ &= \lambda (r^* - v_N)^T (\lambda v_N + (1 - \lambda) v_{LP}) \\ &= (\bar r - \bar v)^T \bar v. \end{aligned}$$

So the inequality is valid, and since again for $v \in NF(r^*)$ 
equality holds in the third line, we know that the face $\bar F$ 
induced by (16) contains $NF(r^*)$. 

Now let $v \in \bar F$, i.e., $(\bar r - \bar v)^T v = (\bar r - \bar v)^T \bar v$. From the above 
equations we conclude $\lambda (r^* - v_N)^T v = \lambda (r^* - v_N)^T v_N$, and 
because $\lambda > 0$ this implies $v \in NF(r^*)$. 

Because the above holds for any $0 < \lambda \le 1$, we can choose 
$\bar r$ arbitrarily close to $v_{LP}$ on $A_c$, which completes the proof. ■ 



An illustration of the process in two dimensions is given in 
Figure 3. 

C. Solving the Nearest Point Problems 

It remains to show how to solve the nearest point problems 
arising in the discussion above. To that end, we utilize an 
algorithm by Wolfe [12] that finds in a polytope the point 
with minimum Euclidean norm. Wolfe's algorithm operates 
on a set of vertices of the polytope that are obtained via 
minimization of linear objective functions. In our situation, 
this means that LP$_Q$ has to be solved repeatedly, which by 
Observation 1 boils down to the linear-time solvable weighted 
sum shortest path problem TC-WS. Note that by subtracting 
$r$ from the results of LP$_Q$ and adding $r$ to the final result, 
the algorithm can be used to calculate the minimum distance 
between $Q$ and $r$ also in the case $r \ne 0$. 

The algorithm in [12] maintains in each iteration a subset 
$P$ of the vertex set $V(Q)$ and a point $x$ such that $x = 
NP(\operatorname{aff}(P))$ lies in the relative interior of $\operatorname{conv}(P)$, where 
$\operatorname{aff}(P)$ is the affine hull of $P$. Such a set is called a corral, 
and we denote the nearest point in $\operatorname{aff}(P)$ by $v^{aff}_P$. 

Initially $P = \{v_0\}$ for an arbitrary vertex $v_0$ and $x = v_0$. 
Note that then $x = v^{aff}_P$ and $P$ is indeed a corral. Then the 
following is executed iteratively (we explain afterwards how 
the computations are actually carried out): 

1) Solve $p = \operatorname{argmin}_{v \in Q} (x^T v)$. 

2) If $x = 0$ (0 is optimal) or $x^T p = x^T x$ ($x$ is optimal), 
stop. Otherwise, set $P := P \cup \{p\}$ and compute $y = v^{aff}_P$. 

3) If $y$ is in the relative interior of $\operatorname{conv}(P)$, $P$ is a corral. 
Set $x := y$ and continue at 1). 

4) Determine $z \in \operatorname{conv}(P) \cap \operatorname{conv}\{x, y\}$ with minimum 
distance to $y$; $z$ will be a boundary point of $\operatorname{conv}(P)$. 

5) Remove from $P$ some point that is not on the smallest 
face of $\operatorname{conv}(P)$ containing $z$, set $x := z$, recompute 
$y = v^{aff}_P$ for the reduced set $P$, and continue at 3). 

The algorithm will eventually find a corral $P$ such that the 
nearest point of $Q$ equals $v^{aff}_P$. 
The computations in each step are performed as follows: 

1) This matches the solution of TC-WS. 

2) If we interchangeably use the symbol $P$ for both the set 
of points and the matrix that contains the elements of $P$ 
as columns, every $v \in \operatorname{aff}(P)$ can be characterized by 
some $\lambda \in \mathbb{R}^{|P|}$ such that $v = P\lambda$ and $e^T \lambda = 1$, where 
$e$ is the all-ones vector. Thus, 
the subproblem of determining $v^{aff}_P$ can be written as 

$$\min \|P\lambda\|_2^2 = \lambda^T P^T P \lambda \quad \text{s.t. } e^T \lambda = 1.$$

It can be shown [12] that this is equivalent to solving 
the system of linear equations 

$$(ee^T + P^T P)\mu = e, \qquad \lambda = \mu / (e^T \mu). \qquad (17)$$

As an efficient method to solve (17), Wolfe suggests to 
maintain an upper triangular matrix $R$ such that $R^T R = 
ee^T + P^T P$. Then the solution $\mu$ can be found by first 
solving $R^T \bar\mu = e$ for $\bar\mu$ and then $R\mu = \bar\mu$ for $\mu$; both 
can be done by a simple forward or backward substitution. When 
$P$ changes, $R$ can be updated relatively easily without 
the necessity of a complete recomputation [12]. 

3) $y$ is in the relative interior of $\operatorname{conv}(P)$ if and only if all 
coefficients in the convex representation of $y$ satisfy 
$\lambda_v > 0$. 

4) By construction $x \in \operatorname{conv}(P)$. Let $x = \sum_{v \in P} \lambda_v v$ and 
$y = \sum_{v \in P} \mu_v v$, where $\sum_{v \in P} \lambda_v = \sum_{v \in P} \mu_v = 1$, but 
$\mu_v \le 0$ for at least one $v$. The goal can then be restated 
as finding the minimal $\theta \in [0, 1]$ such that $z_\theta = \theta x + 
(1 - \theta) y \in \operatorname{conv}(P)$. Substituting the above expressions 
yields 

$$z_\theta = \sum_{v \in P} (\theta \lambda_v + (1 - \theta) \mu_v) v,$$

and the condition is that all coefficients are nonnegative. 
Thus, for all $v$ with $\mu_v < 0$, 

$$\theta \ge \frac{\mu_v}{\mu_v - \lambda_v}$$

must hold. In summary, $\theta$ can be computed by the rule 

$$\theta = \min \left\{ 1, \max \left\{ \frac{\mu_v}{\mu_v - \lambda_v} : \mu_v < 0 \right\} \right\}.$$

5) A point not contained in the smallest face of $\operatorname{conv}(P)$ 
containing $z$ is not needed for the convex description of 
$z = \sum_{v \in P} \lambda_v v$ and is identified by $\lambda_v = 0$. 

Fig. 3: Schematic execution of Algorithm 1 in image space. 
(a) Step $i$: $v^i = NP(r^{i-1})$ is found as nearest point to some 
reference point $r^{i-1}$. The intersection of the separating hyperplane 
with the axis $A_c$, $r^i$, will be the reference point of the next 
iteration. (b) Step $i + 1$: Note that the induced face of $Q$ here is 
a facet, while it was a 0-dimensional face in step $i$. (c) ... 
intersects $A_c$ at $v_{LP}$, but the algorithm cannot yet detect this. 
(d) ... The solution $\mathfrak{D}^{-1}(v_{LP})$ is returned. 
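The five steps can be put together in a compact sketch. The version below is a simplified illustration and not the implementation from [12]: the polytope is given by an explicit vertex list, so step 1 is a plain scan instead of a TC-WS shortest path computation; the linear system is solved from scratch by Gaussian elimination instead of updating the triangular factor $R$; and there are no safeguards against degenerate numerics. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the small dense system A u = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    u = [0.0] * n
    for c in reversed(range(n)):
        u[c] = (M[c][n] - sum(M[c][j] * u[j] for j in range(c + 1, n))) / M[c][c]
    return u


def affine_coefficients(P):
    """Coefficients of the min-norm point of aff(P): (ee^T + P^T P) mu = e, normalised."""
    A = [[1.0 + dot(p, q) for q in P] for p in P]
    mu = solve(A, [1.0] * len(P))
    total = sum(mu)
    return [m / total for m in mu]


def combine(P, lam):
    return [sum(l * p[i] for l, p in zip(lam, P)) for i in range(len(P[0]))]


def min_norm_point(vertices, eps=1e-10):
    """Wolfe-style search for the point of conv(vertices) closest to the origin."""
    P = [min(vertices, key=lambda v: dot(v, v))]
    lam = [1.0]
    x = list(P[0])
    while True:
        p = min(vertices, key=lambda v: dot(x, v))            # step 1: linear oracle
        if dot(x, x) <= eps or dot(x, p) >= dot(x, x) - eps:  # step 2: optimality test
            return x
        P.append(p)
        lam.append(0.0)
        while True:
            mu = affine_coefficients(P)                       # y = sum_v mu_v * v
            if all(m > eps for m in mu):                      # step 3: P is a corral
                lam = mu
                break
            # step 4: smallest theta with theta * lam + (1 - theta) * mu >= 0
            theta = max([m / (m - l) for l, m in zip(lam, mu) if m < 0], default=0.0)
            z = [theta * l + (1 - theta) * m for l, m in zip(lam, mu)]
            # step 5: drop points whose coefficient vanished, then renormalise
            keep = [i for i, c in enumerate(z) if c > eps]
            P = [P[i] for i in keep]
            total = sum(z[i] for i in keep)
            lam = [z[i] / total for i in keep]
        x = combine(P, lam)
```

To obtain the nearest point of a polytope to a reference point $r \ne 0$, the vertices can be shifted by $-r$ before the call and $r$ added to the result, as noted in the text.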

D. Recovering the Optimal Flow and Pseudocodeword 

So far we have shown how to compute the optimal objective 
value, but not the LP solution, i.e., the flow $f_{LP} \in \mathcal{P}_{path}$ and 
the (pseudo)codeword $y$. The algorithm yields its solution $v_{LP}$ 
by means of a convex combination of extreme points of $Q$: 

$$v_{LP} = \sum_{i=1}^t \lambda_i v_i, \qquad \lambda_i \ge 0, \qquad \sum_{i=1}^t \lambda_i = 1.$$

During its execution the preimage paths $f_i = \mathfrak{D}^{-1}(v_i)$ can be 
stored alongside the $v_i$. Then, the LP-optimal flow $f_{LP}$ 
is obtained by summing up the paths with the same weight 
coefficients $\lambda$, i.e., 

$$f_{LP} = \sum_{i=1}^t \lambda_i \mathfrak{D}^{-1}(v_i).$$

In order to get the corresponding pseudocodeword, a simple 
computation based on (7) suffices. 
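The recombination of stored paths into a fractional flow can be sketched in a few lines. The representation chosen here is an assumption for illustration: each path is a collection of hashable edge identifiers, and the function name `combine_flows` is hypothetical.

```python
def combine_flows(paths, weights):
    """f_LP = sum_i lambda_i * f_i, each path given as a collection of edge identifiers."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    flow = {}
    for path, w in zip(paths, weights):
        for edge in path:
            flow[edge] = flow.get(edge, 0.0) + w   # edges shared by paths accumulate weight
    return flow
```

Edges used by all stored paths end up with flow value 1, while edges used by only some of them receive a fractional value.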






For most applications, however, the values of y are of 
interest only in the case that the decoder has found a valid 
codeword, i.e., t = 1 in the above sum. In such a case, the 
most recent solution of (TC-WS) is an agreeable path that 
immediately gives the codeword. No intermediate paths have 
to be stored, which can save a substantial amount of space 
and running time. 

E. Efficient Reference Point Updates 

As suggested by the proof of Theorem 2, the nearest point 
algorithm is run iteratively, and between two runs the $(k+1)$st 
component of $r$ is increased by means of the rule $r_{k+1} = 
b(r) / a(r)_{k+1}$. This section describes how some information from 
the previous iteration can be re-used to provide an efficient 
warm start for the next nearest point run. 

Assume that in iteration $i$ the point $NP(r^i) = v^i$ has been 
found, inducing the face $NF(r^i)$ defined by $a(r^i)^T v \le b(r^i)$ 
of $Q$. Recall that the nearest point algorithm (NPA) internally 
computes the minimum $\ell_2$ norm of $Q - r^i$. Thus, it outputs 
$\bar v^i = v^i - r^i$ as the convex combination of $t \le k + 1$ points 
$\bar v_j = v_j - r^i$, all of which are located on the corresponding 
face $\overline{NF}(r^i)$ of $Q - r^i$: 

$$\bar v^i = \sum_{j=1}^t \lambda_j \bar v_j.$$

In the subsequent nearest point calculation, the norm of 
$Q - r^{i+1}$ is minimized. Obviously $NF(r^i)$ corresponds to a face 
of $Q - r^{i+1}$, and we can initialize the algorithm 
with that face by simply adding $r^i - r^{i+1}$ to $\bar v^i$ and each 
$\bar v_j$, $j = 1, \dots, t$, which yields 

$$\bar v^i + r^i - r^{i+1} = \sum_{j=1}^t \lambda_j (\bar v_j + r^i - r^{i+1}),$$

and all $\bar v_j + r^i - r^{i+1}$ are vertices of $Q - r^{i+1}$. Note that 
$r^i - r^{i+1}$ is zero in all but the last component, so this update 
takes only $t \le k + 1$ steps. 

In order to warm-start the nearest point algorithm, the
auxiliary matrix $R$ has to be recomputed as well. Using its
definition

$$R^T R = ee^T + V^T V,$$

we can efficiently compute $R$ by Cholesky decomposition.
After these updates we can directly start the nearest point 
algorithm in Step 2. Numerical experiments have shown that 
this speeds up LP decoding by a factor of two. In particular, the 
computation time of the Cholesky decomposition is negligible. 
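The two warm-start updates can be sketched as follows. The sketch assumes that the columns of $V$ are the shifted points and that $e$ is the all-ones vector. The function names are hypothetical, and a plain textbook Cholesky factorization stands in for whatever routine an actual implementation would use.

```python
import math


def shift_last_component(points, delta):
    """Add delta to the last coordinate of every point (all other coordinates are unchanged)."""
    return [p[:-1] + [p[-1] + delta] for p in points]


def cholesky_upper(points):
    """Upper-triangular R with R^T R = e e^T + V^T V, the columns of V being the points."""
    t = len(points)
    # Gram matrix entry (a, b) is 1 + <v_a, v_b>.
    gram = [[1.0 + sum(x * y for x, y in zip(points[a], points[b]))
             for b in range(t)] for a in range(t)]
    r = [[0.0] * t for _ in range(t)]
    for a in range(t):
        for b in range(a, t):
            s = gram[a][b] - sum(r[m][a] * r[m][b] for m in range(a))
            if a == b:
                r[a][a] = math.sqrt(s)  # fails if the points are affinely dependent
            else:
                r[a][b] = s / r[a][a]
    return r


if __name__ == "__main__":
    shifted = shift_last_component([[0.0, 1.0], [1.0, 2.0]], 0.5)
    print(shifted)
    print(cholesky_upper(shifted))
```

Only the last coordinate of each of the $t$ points changes, so the shift itself is cheap; the factorization works on a $t \times t$ matrix.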

V. The Complete Algorithm 



Algorithm 1 formalizes the procedure developed in Section IV
in pseudocode. The initial reference point $r^0$ is
generated by first minimizing $c(f)$ on $\mathcal{P}_{\mathrm{path}}$ (thus solving
TC-WS with $\gamma = (0, \ldots, 0, 1)$) and projecting the result in
constraints space onto $A_c$ (Line 4). Thereby we ensure that
either $r^0 \notin Q$ or it is located on the boundary of $Q$, in which
case it already is the optimal LP solution. The solution of the
nearest point problem and the recovery of the original flow
are encapsulated in Lines 7 and 8.



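Algorithm 1 Combinatorial Turbo LP Decoder (CTLP)

Initialize edge cost $c(f)$ by (3)
$f^0 \leftarrow \arg\min\{c(f) : f \in \mathcal{P}_{\mathrm{path}}\}$
$r^0 \leftarrow (0, \ldots, 0, c(f^0))^T$
$i \leftarrow 0$
while … do
    $v^{i+1} \leftarrow NP(r^i) = \arg\min_{v \in Q} \|v - r^i\|_2$
    …
    $i \leftarrow i + 1$
end while
return $f^i$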
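The outer loop can be illustrated on a two-dimensional toy instance. The sketch below is not the decoder itself. It makes three assumptions: a convex polygon stands in for $Q$, with the first coordinate playing the role of the single constraint and the second that of the cost $c$; a brute-force search over the polygon edges replaces NPA and the shortest path oracle; and the reference point is moved to the intersection of the supporting hyperplane with the $c$-axis. That update is a plausible reading of the procedure, not a formula taken from the text. All names are hypothetical.

```python
import math


def nearest_on_segment(p, q, r):
    """Point of the segment [p, q] closest to r."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    t = ((r[0] - p[0]) * dx + (r[1] - p[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return (p[0] + t * dx, p[1] + t * dy)


def nearest_point(vertices, r):
    """Nearest boundary point of the polygon; r is assumed not to lie in its interior."""
    n = len(vertices)
    candidates = [nearest_on_segment(vertices[j], vertices[(j + 1) % n], r)
                  for j in range(n)]
    return min(candidates, key=lambda v: math.dist(v, r))


def toy_outer_loop(vertices, tol=1e-9, max_iter=100):
    """Minimum cost c such that (0, c) lies in the polygon."""
    # Start from the projection of the cost minimizer onto the c-axis.
    r = (0.0, min(c for _, c in vertices))
    for _ in range(max_iter):
        v = nearest_point(vertices, r)
        a = (v[0] - r[0], v[1] - r[1])   # normal of the supporting hyperplane
        if math.hypot(a[0], a[1]) <= tol:
            return r[1]                  # r lies on the polygon: done
        if a[1] <= tol:
            raise ValueError("the c-axis does not meet the polygon")
        b = a[0] * v[0] + a[1] * v[1]
        r = (0.0, b / a[1])              # assumed update: hyperplane meets the axis
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    polygon = [(-1.0, 0.0), (1.0, 1.0), (1.0, 3.0), (-1.0, 4.0)]
    print(toy_outer_loop(polygon))
```

For the example polygon, the lower edge crosses the axis at cost 0.5, and the loop reaches that value after a single update of the reference point.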



VI. Numerical Results 
A. Running Time Comparison 

To evaluate the computational performance of our algorithm,
we compare its running time with the commercial
general purpose LP solver CPLEX [7], which is said to be
one of the most competitive implementations available.

Simulations were run using LTE turbo codes with blocklengths
132, 228, and 396, and a three-dimensional
turbo code with blocklength 384 (taken from
[21]) with various SNR values. For each SNR value, we have
generated up to 10^ noisy frames, where the computation was 
stopped when 200 decoding errors occurred. This should ensure
sufficient significance of the average results shown in Tables I-IV.

TABLE I: Average CPU time per instance (in fractions of a second)
for the (132,40) LTE turbo code

SNR      0      1      2      3      4      5
CPLEX    9.1    9.5    9.6    9.6    9.6    9.8
CTLP     1.4    0.9    0.5    0.29   0.24   0.22
ratio    6.5    10.6   19     33     40     45
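The ratio row is the quotient of the CPLEX and CTLP running times, so the time unit cancels out. A quick check with the values of Table I, assuming the columns correspond to SNR 0 to 5 (the function name is hypothetical):

```python
def speedups(reference_times, new_times):
    """Element-wise quotient reference / new, i.e. the speedup of the new method."""
    return [ref / new for ref, new in zip(reference_times, new_times)]


if __name__ == "__main__":
    cplex = [9.1, 9.5, 9.6, 9.6, 9.6, 9.8]
    ctlp = [1.4, 0.9, 0.5, 0.29, 0.24, 0.22]
    for snr, ratio in enumerate(speedups(cplex, ctlp)):
        print(f"SNR {snr}: {ratio:.1f}")
```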



TABLE II: Average CPU time per instance (in fractions of a second)
for the (228, 72) LTE turbo code

SNR                        0      1      2      3      4      5
CPLEX ($\times 10^{-1}$)   3.1    3.4    4.2    4.6    4.7    4.7
CTLP ($\times 10^{-1}$)    0.7    0.4    0.15   0.05   0.04   0.04
ratio                      4.4    8.5    28     92     118    118



TABLE III: Average CPU time per instance (in fractions of a second)
for the (396, 128) LTE turbo code

SNR      0      1      2      3      4
CPLEX    4.4    4.2    3.6    3.3    3.2
CTLP     6.3    4.1    0.6    0.09   0.08
ratio    0.7    1      6      37     40



Fig. 4: CPU time comparison for the (132,40) LTE turbo
code depending on the SNR value (note the logarithmic time
scale; horizontal axis: SNR$_b$ in dB).

TABLE IV: Average CPU time per instance (in seconds) for
a (384, 128) 3-D turbo code

SNR      0      1      2      3      4
CPLEX    1.4    1.2    0.9    0.72   0.57
CTLP     4.5    3.1    0.8    0.04   0.014
ratio    0.31   0.39   1.1    18     41

As one can see, the benefit of using the new algorithm is
larger for high SNR values. This becomes most evident for the
3-D code, for which the dimension of $Q$ is the highest, where
the new algorithm is slower than CPLEX for SNRs up to 2.
The reason for this behavior can be explained by analyzing
statistical information about various internal parameters of the
algorithm when run with different SNR values:

• The average dimension of the optimal nearest face, found
in the last iteration of the algorithm, drops substantially
with increasing SNR. Intuitively, it is not surprising that
a face that needs fewer vertices to describe can be
found more efficiently.

• In particular, the share of instances for which the LP 
solution is integral (and thus, the face dimension is zero) 
increases with the SNR. 

• There are some trivial instances where the initial short- 
est path among both trellis graphs is already a valid 
codeword. This occurs more often for low channel noise 
and allows for extremely fast solution (no nearest point 
calculations have to be carried out). 

• The average number of major cycles of the nearest point 
algorithm performed per instance is seen to drop rapidly 
with increasing SNR. 

• Likewise, the average number of main loops (Line 6
of Algorithm 1) drops, reducing the required calls to
CTLP.

Table V lists, as an example, the average per-instance values
of these parameters for the (132,40) LTE code and SNRs 0,
2, and 4.



TABLE V: Statistical data for the (132, 40) LTE turbo code;
average per-instance counts

SNR          0      2      4
face dim     25.2   3.6    0.01
integral     0.26   0.89   0.9995
trivial      0      0.13   0.64
major        221    53     4
main loops   4.36   1.9    0.7



Fig. 5: Average per-instance CPU time spent on various
subroutines of the algorithm decoding the (132, 40) LTE code
(SP=shortest path, LstSq=solution of least squares problems,
GenSol=generation of solution in path space; horizontal axis: SNR$_b$ in dB).

B. Numerical Stability 

For larger codes, the dimension of Q becomes very large 
which leads to numerical difficulties in the nearest point 
algorithm: the equation systems solved during the execution 
sometimes have rank "almost zero" which leads to division by 
very small numbers, resulting in the floating-point value NaN. 
Careful adjustment of the tolerance values for equivalence
checks helps to eliminate this problem, at least for the block
lengths presented in this numerical study.

In addition, it has proven beneficial to divide all objective 
values by 10 in advance. Intuitively, this compresses Q along 
the c-axis, evening out the extensiveness of the polytope in 
the different dimensions (note that for all axes other than $c$,
the values only range from $-1$ to $1$).
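Both measures are simple to express in code. The following sketch assumes that points of $Q$ are stored as tuples whose last entry is the cost coordinate. The tolerance `1e-9` and all function names are illustrative choices, not values taken from the experiments.

```python
import math


def scale_costs(points, factor=0.1):
    """Scale the last (cost) coordinate of each point, e.g. divide by 10."""
    return [p[:-1] + (p[-1] * factor,) for p in points]


def nearly_equal(x, y, tol=1e-9):
    """Equivalence check with an absolute tolerance instead of ==."""
    return math.isclose(x, y, rel_tol=0.0, abs_tol=tol)


def safe_divide(num, den, tol=1e-9):
    """num / den, or None if den is numerically zero (avoids inf and NaN)."""
    if nearly_equal(den, 0.0, tol):
        return None
    return num / den


if __name__ == "__main__":
    print(scale_costs([(1.0, -1.0, 20.0)]))
    print(nearly_equal(0.1 + 0.2, 0.3))
    print(safe_divide(1.0, 1e-15))
```

A caller that receives `None` can treat the corresponding equation system as rank deficient instead of propagating NaN values.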

VII. Improving Error-Correcting Performance 

As discussed above, Algorithm 1 can be easily modified to
return a list of paths $f_i$, $i = 1, \ldots, t$, such that the LP solution
is a convex combination of those paths. Each $f_i$ can be split into
two paths, one through each of the two trellises.
A path in a trellis, in turn, can uniquely be extended to a
codeword. Thus, we have a total of $2t$ candidate codewords. By
selecting among them the codeword with minimum objective
function value, we obtain a heuristic decoder (Heuristic A in
the following) that always outputs a valid codeword, and has
the potential of a better error-correcting performance than pure
LP decoding.

Fig. 6: Decoding performance of the proposed heuristic enhancements
on the (132,40) LTE turbo code (horizontal axis: SNR$_b$ in dB).

A slightly better decoding performance, at the cost of a
further increase in running time, is reached if we consider not
only the paths that constitute the final LP solution but rather
all intermediate solutions of TC-WS. We call this modification
Heuristic B.
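The selection step shared by both heuristics can be sketched as follows. The sketch assumes that the candidate codewords have already been obtained by extending the trellis paths, and that the objective is a linear cost vector applied to the codeword bits. The names `objective`, `best_candidate` and `cost` are hypothetical.

```python
def objective(codeword, cost):
    """Linear objective value of a 0/1 codeword under the given cost vector."""
    return sum(c * x for c, x in zip(cost, codeword))


def best_candidate(candidates, cost):
    """Candidate codeword with minimum objective value."""
    return min(candidates, key=lambda cw: objective(cw, cost))


if __name__ == "__main__":
    cost = [1.0, -2.0, 0.5]
    candidates = [(0, 0, 0), (0, 1, 0), (1, 1, 1)]
    print(best_candidate(candidates, cost))
```

Heuristic A and Heuristic B differ only in the candidate list passed in: the former uses the paths of the final LP solution, the latter those of all intermediate solutions.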

Simulation results for the (132,40) LTE code are shown in
Figure 6. As one can see, the frame error rate indeed drops
notably when using the heuristics, but for low SNR values
there still remains a substantial gap to the ML decoding curve.
At 5 dB, Heuristic B empirically reaches ML performance;
for lower SNR values it is comparable to a Log-MAP turbo
decoder with 8 iterations.

VIII. Conclusion and Outlook 

We have shown how the inherent combinatorial network-flow
structure of turbo codes in the form of the trellis graphs can
be utilized to construct a highly efficient LP solver, specialized
for that class of codes. The decrease in running time, compared 
to a general purpose solver, is dramatic, and in contrast to 
classical approaches based on Lagrangian dualization, the 
algorithm is guaranteed to terminate after a finite number of 
steps with the exact LP solution. 

It is still an open question, however, if and how the LP 
can be solved in a completely combinatorial manner. The 
nearest point algorithm suggested in this paper introduces a 
numerical component, which is necessary at this time but 
rather undesirable since it can lead to numerical problems in 
high dimension. 

Another direction for further research is to examine the
usefulness of our decoder as a building block of branch-and-bound
methods that solve the integer programming problem,
i.e., ML decoders. Several properties of the decoder suggest
that this might be a valuable task. For instance, the shortest
paths can be computed even faster if a portion of the variables
is fixed, or the algorithm could be terminated prematurely if
the reference point exceeds a known upper bound at the current
node of the branch-and-bound tree.



Finally, the concepts presented here might be of inner- 
mathematical interest as well. Optimization problems that 
are easy to solve in principle but have some complicating 
constraints are very common in mathematical optimization. 
Being able to efficiently solve their LP relaxation is a key 
component of virtually all fast exact or approximate solution 
algorithms. 

Acknowledgements 

We would like to acknowledge the German Research Council
(DFG), the German Academic Exchange Service (DAAD),
and the Center for Mathematical and Computational Modelling
((CM)²) of the University of Kaiserslautern for financial
support.

References 

[1] J. Feldman, D. R. Karger, and M. Wainwright, "Linear programming-based
decoding of turbo-like codes and its relation to iterative approaches,"
in Proc. 40th Allerton Conf. Commun. Control Computing,
2002.

[2] J. Feldman, "Decoding error-correcting codes via linear programming," 
Ph.D. dissertation, Massachusetts Institute of Technology, 2003. 

[3] P. O. Vontobel and R. Koetter, "Graph-cover decoding and finite-length
analysis of message-passing iterative decoding of LDPC codes,"
arXiv:cs/0512078v1 [cs.IT], 2005.

[4] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit
error-correcting coding and decoding: Turbo-codes," in IEEE Int. Conf.
Commun., May 1993, pp. 1064-1070.

[5] R. G. Gallager, "Low-density parity-check codes," IRE Trans. Inf. 
Theory, vol. 8, no. 1, pp. 21-28, Jan. 1962. 

[6] M. Helmling, S. Ruzika, and A. Tanatmis, "Mathematical programming
decoding of binary linear codes: Theory and algorithms," IEEE Trans.
Inf. Theory, vol. 58, no. 7, pp. 4753-4769, Jul. 2012.

[7] "IBM ILOG CPLEX optimization studio," Software Package, 2011, 
version 12.4. 

[8] A. Schrijver, Theory of linear and integer programming. John Wiley
& Sons, 1986.

[9] A. Tanatmis, S. Ruzika, and F. Kienle, "A Lagrangian relaxation based
decoding algorithm for LTE turbo codes," in Proc. Int. Symp. Turbo
Codes and Iterative Inf. Proc., Brest, France, Sep. 2010, pp. 369-373.

[10] S. Ruzika, "On multiple objective combinatorial optimization," Ph.D.
dissertation, University of Kaiserslautern, 2007.

[11] A. Tanatmis, "Mathematical programming approaches for decoding of
binary linear codes," Ph.D. dissertation, University of Kaiserslautern,
Kaiserslautern, Germany, Aug. 2010.

[12] P. Wolfe, "Finding the nearest point in a polytope," Math. Program., 
vol. 11, pp. 128-149, 1976. 

[13] D. J. C. MacKay, Information Theory, Inference, and Learning Algo- 
rithms. Cambridge University Press, 2003. 

[14] TS 36.212 v11.0.0: LTE E-UTRA Multiplexing and Channel
Coding, 3rd Generation Partnership Project (3GPP) Std., Oct.
2012. [Online]. Available: http://www.etsi.org/deliver/etsi_ts/136200_136299/136212/11.00.00_60/ts_136212v110000p.pdf

[15] S. Lin and D. Costello, Jr., Error Control Coding, 2nd ed. Upper Saddle 
River, NJ: Prentice-Hall, Inc., 2004. 

[16] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows. Prentice- 
Hall, 1993. 

[17] E. Berlekamp, R. J. McEliece, and H. C. A. van Tilborg, "On the
inherent intractability of certain coding problems," IEEE Trans. Inf.
Theory, vol. 24, no. 3, pp. 954-972, 1978.

[18] A. Tanatmis, S. Ruzika, M. Punekar, and F. Kienle, "Numerical compar- 
ison of IP formulations as ML decoders," in IEEE Int. Conf. Commun., 
2010, pp. 1-5. 

[19] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction
to Algorithms, 2nd ed. MIT Press, 2001.

[20] G. L. Nemhauser and L. A. Wolsey, Integer and Combinatorial Op- 
timization. Wiley-Interscience series in discrete mathematics and 
optimization, John Wiley & Sons, 1988. 

[21] E. Rosnes, M. Helmling, and A. Graell i Amat, "Pseudocodewords of
linear programming decoding of 3-dimensional turbo codes," in Proc.
IEEE Int. Symp. Inform. Theory, St. Petersburg, Russia, Jul. / Aug. 2011,
pp. 1643-1647.



